Training agents via off-policy deep reinforcement learning (RL) requires a large memory, named replay memory, that stores past experiences used for learning. These experiences are sampled, uniformly or non-uniformly, to create the batches used for training. When calculating the loss function, off-policy algorithms assume that all samples are of the same importance. In this paper, we hypothesize that training can be enhanced by assigning different importance for each experience based on their temporal-difference (TD) error directly in the training objective. We propose a novel method that introduces a weighting factor for each experience when calculating the loss function at the learning stage. In addition to improving convergence speed when used with uniform sampling, the method can be combined with prioritization methods for non-uniform sampling. Combining the proposed method with prioritization methods improves sampling efficiency while increasing the performance of TD-based off-policy RL algorithms. The effectiveness of the proposed method is demonstrated by experiments in six environments of the OpenAI Gym suite. The experimental results demonstrate that the proposed method achieves a 33%~76% reduction of convergence speed in three environments and an 11% increase in returns and a 3%~10% increase in success rate for other three environments.
translated by 谷歌翻译
相对摄像头姿势估计,即使用在不同位置拍摄的一对图像来估算翻译和旋转向量,是增强现实和机器人技术系统中系统的重要组成部分。在本文中,我们使用独立于摄像机参数的暹罗体系结构提出了端到端的相对摄像头姿势估计网络。使用剑桥地标数据和四个单独的场景数据集和一个结合四个场景的数据集对网络进行培训。为了改善概括,我们提出了一种新颖的两阶段训练,以减轻超参数以平衡翻译和旋转损失量表的需求。将提出的方法与基于CNN的一阶段培训方法(例如RPNET和RCPNET)进行了比较,并证明了所提出的模型在Kings College,Old Hospital和St Marys上提出的翻译量估计提高了16.11%,28.88%和52.27%教堂场景分别。为了证明纹理不变性,我们使用生成的对抗网络研究了提出的方法的概括,将数据集扩展到不同场景样式,作为消融研究。此外,我们对网络预测和地面真相构成的异性线进行定性评估。
translated by 谷歌翻译
恶意软件是对计算机系统的主要威胁,并对网络安全构成了许多挑战。有针对性的威胁(例如勒索软件)每年造成数百万美元的损失。恶意软件感染的不断增加一直激励流行抗病毒(AV)制定专用的检测策略,其中包括精心制作的机器学习(ML)管道。但是,恶意软件开发人员不断地将样品的功能更改为绕过检测。恶意软件样品的这种恒定演变导致数据分布(即概念漂移)直接影响ML模型检测率,这是大多数文献工作中未考虑的。在这项工作中,我们评估了两个Android数据集的概念漂移对恶意软件分类器的影响:DREBIN(约130k应用程序)和Androzoo(约350K应用程序)的子集。我们使用这些数据集训练自适应随机森林(ARF)分类器以及随机梯度下降(SGD)分类器。我们还使用其Virustotal提交时间戳订购了所有数据集样品,然后使用两种算法(Word2Vec和tf-idf)从其文本属性中提取功能。然后,我们进行了实验,以比较两个特征提取器,分类器以及四个漂移检测器(DDM,EDDM,ADWIN和KSWIN),以确定真实环境的最佳方法。最后,我们比较一些减轻概念漂移的可能方法,并提出了一种新的数据流管道,该管道同时更新分类器和特征提取器。为此,我们通过(i)对9年来收集的恶意软件样本进行了纵向评估(2009- 2018年),(ii)审查概念漂移检测算法以证明其普遍性,(iii)比较不同的ML方法来减轻此问题,(iv)提出了超过文献方法的ML数据流管道。
translated by 谷歌翻译
尽管高容量计算平台的可用性日益增长,但实施复杂性仍然是神经网络现实部署的重要问题。这种关注并不仅仅是由于最先进的网络体系结构的巨大成本,也是由于最近朝着边缘智能和嵌入式应用中使用神经网络的使用。在这种情况下,网络压缩技术由于能够降低部署成本的能力,同时将推断准确性保持在令人满意的水平,因此引起了兴趣。本文致力于开发针对神经网络的新型压缩方案。为此,首先开发了一种新的$ \ ell_0 $ -norm正规化方法,该方法能够在培训期间诱导网络中的强烈稀疏性。然后,可以通过修剪技术来瞄准训练有素的网络的较小权重,可以获得较小但高效的网络。提出的压缩方案还涉及使用$ \ ell_2 $ -Norm正则化以避免过度拟合以及进行微调以提高修剪网络的性能。提出了实验结果,目的是显示拟议方案的有效性,并与竞争方法进行比较。
translated by 谷歌翻译
A Digital Twin (DT) is a simulation of a physical system that provides information to make decisions that add economic, social or commercial value. The behaviour of a physical system changes over time, a DT must therefore be continually updated with data from the physical systems to reflect its changing behaviour. For resource-constrained systems, updating a DT is non-trivial because of challenges such as on-board learning and the off-board data transfer. This paper presents a framework for updating data-driven DTs of resource-constrained systems geared towards system health monitoring. The proposed solution consists of: (1) an on-board system running a light-weight DT allowing the prioritisation and parsimonious transfer of data generated by the physical system; and (2) off-board robust updating of the DT and detection of anomalous behaviours. Two case studies are considered using a production gas turbine engine system to demonstrate the digital representation accuracy for real-world, time-varying physical systems.
translated by 谷歌翻译
Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained by data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics' community to take advantage of these advances.
translated by 谷歌翻译
We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos - S\~ao Vicente - Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations in sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function evaluation points during training, which has a near zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
translated by 谷歌翻译
Most TextVQA approaches focus on the integration of objects, scene texts and question words by a simple transformer encoder. But this fails to capture the semantic relations between different modalities. The paper proposes a Scene Graph based co-Attention Network (SceneGATE) for TextVQA, which reveals the semantic relations among the objects, Optical Character Recognition (OCR) tokens and the question words. It is achieved by a TextVQA-based scene graph that discovers the underlying semantics of an image. We created a guided-attention module to capture the intra-modal interplay between the language and the vision as a guidance for inter-modal interactions. To make explicit teaching of the relations between the two modalities, we proposed and integrated two attention modules, namely a scene graph-based semantic relation-aware attention and a positional relation-aware attention. We conducted extensive experiments on two benchmark datasets, Text-VQA and ST-VQA. It is shown that our SceneGATE method outperformed existing ones because of the scene graph and its attention modules.
translated by 谷歌翻译
Existing analyses of neural network training often operate under the unrealistic assumption of an extremely small learning rate. This lies in stark contrast to practical wisdom and empirical studies, such as the work of J. Cohen et al. (ICLR 2021), which exhibit startling new phenomena (the "edge of stability" or "unstable convergence") and potential benefits for generalization in the large learning rate regime. Despite a flurry of recent works on this topic, however, the latter effect is still poorly understood. In this paper, we take a step towards understanding genuinely non-convex training dynamics with large learning rates by performing a detailed analysis of gradient descent for simplified models of two-layer neural networks. For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias). This elucidates one possible mechanism by which the edge of stability can in fact lead to better generalization, as threshold neurons are basic building blocks with useful inductive bias for many tasks.
translated by 谷歌翻译
Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines. In this work we study the generalization ability of these two types of architectures on a wide range of parameter count on both in-domain and out-of-domain scenarios. We find that the number of parameters and early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that cross-encoders largely outperform bi-encoders of similar size in several tasks. In the BEIR benchmark, our largest cross-encoder surpasses a state-of-the-art bi-encoder by more than 4 average points. Finally, we show that using bi-encoders as first-stage retrievers provides no gains in comparison to a simpler retriever such as BM25 on out-of-domain tasks. The code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.git
translated by 谷歌翻译